The Advantages and Disadvantages of Statistical Disclosure Limitation for Program Evaluation
Authors
Abstract
This paper formalizes the manner in which statistical disclosure limitation (SDL) hinders empirical research in economics. We also highlight a hitherto unappreciated advantage of SDL, formal privacy models, and synthetic data systems: they can serve as a defense against model overfitting and false-discovery bias. More specifically, a synthetic data validation system can, and we argue should, be used in conjunction with systems in which researchers register their research design ahead of analysis. The key insight is that privacy-protected data can be used for model development while minimizing the risk of model overfitting. To demonstrate these points, we develop a model in which the statistical agency collects data from a population but publishes a version in which the data have been intentionally distorted by some SDL process. We say the SDL process is ignorable if inferences based on the published data are indistinguishable from inferences based on the unprotected data. SDL is rarely ignorable. If the researcher has knowledge of the SDL model, she can conduct an SDL-aware analysis that explicitly corrects for the effects of SDL. If, as is often the case, the SDL model is unknown, we describe circumstances under which it can still be learned.
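The difference between an ignorable and a non-ignorable SDL process, and what an SDL-aware correction looks like, can be illustrated with a small simulation. The sketch below is not the paper's model: it assumes, purely for illustration, that the agency adds independent Gaussian noise with a known variance to a single regressor. Under that assumption the naive regression slope is attenuated, and dividing by the reliability ratio recovers it. All names (`ols_slope`, `sdl_aware_slope`, `noise_sd`) and all parameter values are hypothetical.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return cov_xy / variance(x)


def sdl_aware_slope(z, y, noise_var):
    """Correct the naive slope for additive noise of known variance.

    Assumes published z = x + e, with e independent of x and of y's error,
    so the naive slope is shrunk by (var(z) - noise_var) / var(z).
    """
    var_z = variance(z)
    reliability = (var_z - noise_var) / var_z
    return ols_slope(z, y) / reliability


rng = random.Random(0)
n = 20000
true_slope = 2.0
noise_sd = 1.0  # assumed SDL noise level, known to the researcher

# Confidential data held by the agency.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [true_slope * xi + rng.gauss(0, 1) for xi in x]

# Published data: the regressor is distorted by the (assumed) SDL process.
z = [xi + rng.gauss(0, noise_sd) for xi in x]

confidential = ols_slope(x, y)                   # analysis on unprotected data
naive = ols_slope(z, y)                          # treats SDL as ignorable
corrected = sdl_aware_slope(z, y, noise_sd ** 2) # SDL-aware analysis

print("confidential:", round(confidential, 3))
print("naive:", round(naive, 3))
print("corrected:", round(corrected, 3))
```

In this setup the naive estimate is pulled toward zero, so the SDL process is not ignorable for the slope, while the corrected estimate should land close to the one computed on the confidential data. The correction depends on knowing `noise_sd`; if the agency does not disclose it, the researcher would first have to learn it from other information.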
Similar resources
Privacy and Statistical Risk: Formalisms and Minimax Bounds
We explore and compare a variety of definitions for privacy and disclosure limitation in statistical estimation and data analysis, including (approximate) differential privacy, testing-based definitions of privacy, and posterior guarantees on disclosure risk. We give equivalence results between the definitions, shedding light on the relationships between different formalisms for privacy. We also...
Using Noise for Disclosure Limitation of Establishment Tabular Data
We propose a new disclosure limitation method for establishment magnitude tabular data in which noise is added to the underlying microdata prior to tabulation. The proposed method has several advantages compared to the standard method of cell suppression: it enables some information to be provided within more cells of the table, it eliminates the need to coordinate cell suppression patterns bet...
Comparison of Remote Analysis with Statistical Disclosure Control for Protecting the Confidentiality of Business Data
This paper is concerned with the challenge of allowing statistical analysis of confidential business data while maintaining confidentiality. The most widely-used approach to date is statistical disclosure control, which involves modifying or confidentialising data before releasing it to users. Newer proposed approaches include the release of multiply imputed synthetic data in place of the origi...
Patient's Bedside Teaching: Advantages and Disadvantages
Introduction: Nursing education includes both theoretical and clinical areas; therefore, it has special features and problems of its own. One is establishing integration between theoretical and clinical training. Patient's bedside teaching is considered to be an important component of clinical education. Time spent with the patient is full of visual, auditory and tactile experiences and there...
Direct Observation of Procedural Skills (DOPS) in Restorative Dentistry: Advantages and Disadvantages in Student's Point of View
Introduction: Direct Observation of Procedural Skills (DOPS) is a valuable method for evaluating clinical procedures. The aim of this study was to evaluate the advantages and disadvantages of this evaluation method from the point of view of restorative dentistry students at Mashhad University of Medical Sciences. Methods: This cross-sectional study was conducted on the students of restorative d...